Speech recognition under musical environments using Kalman filter and iterative MLLR adaptation
Authors
Abstract
In this paper, we propose a speech recognition method for non-stationary musical environments that combines Kalman-filter-based speech signal estimation with iterative unsupervised MLLR (Maximum Likelihood Linear Regression) adaptation. The proposed method estimates the speech signal under non-stationary noise, such as background music, by applying a speech state transition model to Kalman filtering estimation. The speech state transition model represents the state transition of the speech component in non-stationary noisy speech and is modeled using a Taylor expansion. In this model, the state transition of the noise component is estimated by linear prediction. Furthermore, to obtain higher recognition accuracy, we use iterative unsupervised MLLR adaptation to adapt the acoustic models to speech spectra distorted by the residual noise of the Kalman filtering. To evaluate the proposed method, we carried out large-vocabulary continuous speech recognition experiments with three types of music. The proposed method yielded a significant improvement in word accuracy, from 20.04% to 64.43% at 0 dB SNR.
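As a rough illustration of Kalman-filter signal estimation driven by a linear-predictive state model, the sketch below denoises a synthetic autoregressive signal. It is a generic textbook filter, not the method of the paper: it assumes a fixed, known AR model and stationary white observation noise, and it omits the Taylor-expansion speech state transition model and the MLLR adaptation stage. The function name `kalman_denoise`, its parameters, and the test signal are all hypothetical.

```python
import numpy as np


def kalman_denoise(noisy, ar_coeffs, process_var, noise_var):
    """Estimate s[t] from y[t] = s[t] + v[t], where s follows a known AR(p) model.

    Assumption: v is white with variance noise_var, and the AR coefficients
    and excitation variance (process_var) are known in advance.
    """
    p = len(ar_coeffs)
    # Companion-form transition matrix: the first row holds the AR coefficients,
    # the sub-diagonal shifts older samples down the state vector.
    F = np.zeros((p, p))
    F[0, :] = ar_coeffs
    F[1:, :-1] = np.eye(p - 1)
    H = np.zeros((1, p))
    H[0, 0] = 1.0                      # only the newest sample is observed
    Q = np.zeros((p, p))
    Q[0, 0] = process_var              # excitation enters the newest sample only
    x = np.zeros((p, 1))               # state estimate
    P = np.eye(p)                      # state covariance
    out = np.empty(len(noisy))
    for t, y in enumerate(noisy):
        # Prediction step (linear-predictive state transition).
        x = F @ x
        P = F @ P @ F.T + Q
        # Update step with the new noisy observation.
        S = (H @ P @ H.T).item() + noise_var
        K = P @ H.T / S
        x = x + K * (y - (H @ x).item())
        P = (np.eye(p) - K @ H) @ P
        out[t] = x[0, 0]
    return out


# Synthetic demo: a stable AR(2) signal observed in white noise.
rng = np.random.default_rng(0)
ar = np.array([1.6, -0.8])
n = 2000
clean = np.zeros(n)
for t in range(2, n):
    clean[t] = ar[0] * clean[t - 1] + ar[1] * clean[t - 2] + rng.normal(0.0, 1.0)
noisy = clean + rng.normal(0.0, 2.0, size=n)

estimate = kalman_denoise(noisy, ar, process_var=1.0, noise_var=4.0)
mse_noisy = np.mean((noisy - clean) ** 2)
mse_est = np.mean((estimate - clean) ** 2)
print(f"MSE before filtering: {mse_noisy:.3f}, after: {mse_est:.3f}")
```

In this toy setting the filter knows the true signal model, so the filtered error should fall below the raw noise level. With real speech in music, neither the speech nor the noise dynamics are known or stationary, which is the situation the abstract's state transition model is meant to address.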
Similar resources
Speaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation
A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as MAP estimation, only the models with available training data are updated. Hence, large amounts of training data are required in order to have significant recognition improvements. In some others, such as MLLR, where several general transformations are applied to model clusters, the results ar...
Evaluation of noisy speech recognition based on noise reduction and acoustic model adaptation on the Aurora2 tasks
In this paper, we have evaluated a noisy speech recognition method based on noise reduction and acoustic model adaptation on the AURORA2 tasks. For noise reduction, we employed two methods. One is an Adaptive Sub-Band Spectral Subtraction (ASBSS) method, which can vary the noise subtraction rate according to the SNR in each frequency band at each frame. The other is a Kalman filterin...
Speech recognition in a reverberant environment using matched filter array (MFA) processing and linguistic-tree maximum likelihood linear regression (LT-MLLR) adaptation
Performance of automatic speech recognition systems trained on close-talking data suffers when used in a distant-talking environment due to the mismatch in training and testing conditions. Microphone array sound capture can reduce some mismatch by removing ambient noise and reverberation but offers insufficient improvement in performance. However, using array signal capture in conjunction with Hidd...
Speaker adaptation in the Philips system for large vocabulary continuous speech recognition
The combination of Maximum Likelihood Linear Regression (MLLR) with Maximum a posteriori (MAP) adaptation has been investigated for both the enrollment of a new speaker as well as for the asymptotic recognition rate after several hours of dictation. We show that a least mean square approach to MLLR is quite effective in conjunction with phonetically derived regression classes. Results are presen...
Publication date: 2001